Improved message passing for inference in densely connected systems

An improved inference method for densely connected systems is presented. The approach is based on passing condensed messages between variables, representing macroscopic averages of microscopic messages. We extend previous work, which showed promising results in cases where the solution space is contiguous, to cases where fragmentation occurs. We apply the method to the signal detection problem of Code Division Multiple Access (CDMA) to demonstrate its potential. A highly efficient practical algorithm is also derived on the basis of insight gained from the analysis.


Graphical models (Bayes belief networks) provide a powerful framework for modelling statistical dependencies between variables [1, 2, 3]. They play an essential role in devising a principled probabilistic framework for inference in a broad range of applications, from medical expert systems to decoders in telecommunication systems.

Message passing techniques are typically used for inference in graphical models that can be represented by a sparse graph with a few (typically long) loops. They are aimed at obtaining (pseudo) posterior estimates for the system's variables by iteratively passing messages (locally calculated conditional probabilities) between variables. Iterative message passing of this type is guaranteed to converge to the globally correct estimate when the system is tree-like; there are no such guarantees for systems with loops, even in the case of large loops and a local tree-like structure (although message passing techniques have been used successfully in loopy systems, supported by some limited theory [4]). A clear link has been established between certain message passing algorithms and well-known methods of statistical mechanics [5] such as the Bethe approximation [6, 7].

These inherent limitations seem to prevent the use of message passing techniques in densely connected systems: their high connectivity implies an exponentially growing cost and an exponential number of loops. However, an exciting new approach has recently been suggested [8] for extending Belief Propagation (BP) techniques [1, 3] to densely connected systems. In this approach, messages are grouped together, giving rise to a macroscopic random variable, drawn from a Gaussian distribution of varying mean and variance for each of the nodes. The technique has been successfully applied to signal detection in Code Division Multiple Access (CDMA) problems and the results reported are competitive with those of other state-of-the-art techniques. However, the current approach has some inherent limitations [8], presumably due to its similarity to the replica symmetric solution in equivalent Ising spin models [9, 10].

In a separate recent development [11], the replica-symmetric-equivalent BP has been extended to Survey Propagation (SP), which corresponds to one-step replica symmetry breaking in diluted systems. This new algorithm, motivated by the theoretical physics interpretation of such problems, has been highly successful in solving hard computational problems [11], far beyond other existing approaches. In addition, the algorithm facilitated theoretical studies of the corresponding physical system and contributed to our understanding of it [12].

Inspired by the extension of BP to SP, we have extended the approach of [8], designed for inference in densely connected systems, in a similar manner to include an average over multiple pure states. In this article we derive this extension, apply it to the problem of CDMA signal detection [13] and devise a practical algorithm based on insight gained from the analysis. The approach is general and can be applied to a broad range of inference problems. However, to give a specific example and highlight the advantages with respect to the original method [8], we focus here on the application to CDMA signal detection.

Multiple access communication refers to the transmission of multiple messages to a single receiver. The scenario we study here is that of $K$ users transmitting independent messages over an additive white Gaussian noise (AWGN) channel of zero mean and variance $\sigma_0^2$. Various methods are in place for separating the messages, in particular Time, Frequency and Code Division Multiple Access [13]. The latter is based on spreading the signal by using $K$ individual random binary spreading codes of spreading factor $N$. We consider the large-system limit, in which the number of users $K$ tends to infinity while the system load $\beta = K/N$ is kept $O(1)$. We focus on a CDMA system using binary phase shift keying (BPSK) symbols and assume that the power is completely controlled to unit energy. The received aggregated, modulated and corrupted signal is of the form:
$$y_\mu = \frac{1}{\sqrt{N}} \sum_{k=1}^{K} s_{\mu k} b_k + \sigma_0 n_\mu , \qquad (1)$$


where $b_k$ is the bit transmitted by user $k$, $s_{\mu k}$ is the spreading chip value, $n_\mu$ is the Gaussian noise variable drawn from $\mathcal{N}(0,1)$, and $y_\mu$ the received message. The goal is to get an accurate estimate of the vector $\mathbf{b}$ for all users given the received message vector $\mathbf{y}$ by approximating the posterior $P(\mathbf{b}|\mathbf{y})$. A method for obtaining a good estimate of the posterior probability in the case where the noise level is accurately known has been presented in [8]. However, the calculation is based on finding a single solution and is therefore bound to fail, as has been observed, when the solution space becomes fragmented, for instance when the noise level is unknown, a case that arguably corresponds to replica symmetry breaking.

Figure 1: Replicated solutions $B = (\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_K)$ given data.
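
As an illustration, the signal model described above can be simulated in a few lines. The following is a minimal sketch in plain Python, assuming a $1/\sqrt{N}$ normalization of the spread signal (so that each user contributes unit energy) and chips drawn uniformly from $\{-1, +1\}$; the function name `simulate_cdma` and the parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_cdma(K, N, sigma0, seed=0):
    """Draw bits, spreading codes and the received signal for one transmission.

    Assumption (not taken from the text): the spread signal is scaled by
    1/sqrt(N) and the chips are i.i.d. uniform on {-1, +1}.
    """
    rng = random.Random(seed)
    # One BPSK bit per user.
    b = [rng.choice((-1, 1)) for _ in range(K)]
    # s[mu][k] is the chip of user k at chip interval mu.
    s = [[rng.choice((-1, 1)) for _ in range(K)] for _ in range(N)]
    y = []
    for mu in range(N):
        clean = sum(s[mu][k] * b[k] for k in range(K)) / math.sqrt(N)
        # Additive white Gaussian noise of standard deviation sigma0.
        y.append(clean + sigma0 * rng.gauss(0.0, 1.0))
    return b, s, y


# Example: system load beta = K/N = 0.25 and true noise variance 0.25.
b, s, y = simulate_cdma(K=50, N=200, sigma0=0.5)
print(len(b), len(y))
```

The detector only sees `s` and `y`; the bit vector `b` is kept here solely so that an error rate could be measured afterwards.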

The reason for the failure in this case can be qualitatively understood by the same arguments as in the case of sparse graphs: the existence of competing solutions results in inconsistent messages and prevents the algorithm from converging to an accurate estimate. An improved solution can therefore be obtained by averaging over the different solutions, inferred from the same data, in a manner reminiscent of the SP approach, except that the messages in the current case are more complex.

Figure 1 shows the detection problem we aim to solve as a bipartite graph, where $B = (\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_K)$ is the set of bit vectors $\mathbf{b}_k = (b_k^1, b_k^2, \ldots, b_k^n)$, and the superscript, running from 1 to $n$, is the solution (replica) index.

Using Bayes rule one obtains the BP equations:

$$P^{t+1}(y_\mu | \mathbf{b}_k, \{y_{\nu \neq \mu}\}) = a^{t+1}_{\mu} \sum_{\{\mathbf{b}_{l \neq k}\}} P(y_\mu | B) \prod_{l \neq k} P^t(\mathbf{b}_l | \{y_{\nu \neq \mu}\}),$$
$$P^t(\mathbf{b}_k | \{y_{\nu \neq \mu}\}) = a^t_{\mu k} \prod_{\nu \neq \mu} P^t(y_\nu | \mathbf{b}_k, \{y_{\sigma \neq \nu}\}), \qquad (2)$$

where $a^{t+1}_{\mu}$ and $a^t_{\mu k}$ are normalization constants. For calculating the posterior, an expression representing the likelihood is required and is easily derived from the noise model (assuming zero mean and variance $\sigma^2$).
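An explicit expression for the inter-dependence between solutions is required for obtaining a closed set of update equations. We assume a dependence of the form

$$P^t(\mathbf{b}_k | \{y_{\nu \neq \mu}\}) \propto \exp\left\{ (\mathbf{h}^t_{\mu k})^{\mathsf{T}} \mathbf{b}_k + \frac{1}{2} \mathbf{b}_k^{\mathsf{T}} Q^t_{\mu k} \mathbf{b}_k \right\}, \qquad (4)$$

where $\mathbf{h}^t_{\mu k}$ is a vector representing an external field and $Q^t_{\mu k}$ the matrix of cross-replica interaction. Furthermore, we assume the following symmetry between replica:

$$(Q^t_{\mu k})^{ab} = \delta^{ab} q^t_{\mu k} + (1 - \delta^{ab}) p^t_{\mu k}, \qquad \mathbf{h}^t_{\mu k} = h^t_{\mu k} \mathbf{u}, \qquad (5)$$

with $\mathbf{u}$ the $n$-component vector of ones.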

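Substituting Eq. (5) into Eq. (4) and using $(b_k^a)^2 = 1$, an explicit expression immediately follows:

$$P^t(\mathbf{b}_k | \{y_{\nu \neq \mu}\}) = \frac{1}{Z^t_{\mu k}} \exp\left\{ h^t_{\mu k} \sum_{a=1}^{n} b_k^a + \frac{n}{2} \left( q^t_{\mu k} - p^t_{\mu k} \right) + \frac{p^t_{\mu k}}{2} \Big( \sum_{a=1}^{n} b_k^a \Big)^2 \right\},$$

where $Z^t_{\mu k}$ is a normalization constant.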
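
The replica-symmetric structure assumed in Eq. (5) can be checked numerically. The sketch below (plain Python; the helper names and parameter values are hypothetical) builds a matrix with diagonal entries $q$ and off-diagonal entries $p$, and compares the exponent of Eq. (4) with the reduced form $h \sum_a b^a + n(q - p)/2 + (p/2)(\sum_a b^a)^2$, which should hold for binary $b^a = \pm 1$ because $(b^a)^2 = 1$.

```python
import itertools


def replica_symmetric_Q(n, q, p):
    """n x n matrix with q on the diagonal and p everywhere else."""
    return [[q if a == c else p for c in range(n)] for a in range(n)]


def exponent_full(b, h, Q):
    """h * u^T b + (1/2) b^T Q b, with u the vector of ones."""
    n = len(b)
    field = h * sum(b)
    quad = sum(b[a] * Q[a][c] * b[c] for a in range(n) for c in range(n))
    return field + 0.5 * quad


def exponent_reduced(b, h, q, p):
    """Same exponent written in terms of the replica sum only."""
    n = len(b)
    total = sum(b)
    return h * total + 0.5 * n * (q - p) + 0.5 * p * total ** 2


# Illustrative values; any n, h, q, p would do.
n, h, q, p = 4, 0.3, 1.2, 0.1
Q = replica_symmetric_Q(n, q, p)
max_gap = max(
    abs(exponent_full(b, h, Q) - exponent_reduced(b, h, q, p))
    for b in itertools.product((-1, 1), repeat=n)
)
print(max_gap)  # expected to be at floating-point rounding level
```

The reduced form makes explicit that, under this symmetry, the distribution depends on the replicated bits only through their sum.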

We expect the free energy obtained from the well-behaved distribution $P^t$ to be self-averaging, thus

$$\lim_{n \to \infty} \frac{1}{n} \log \left( Z^t_{\mu k} \right) = \lim_{n \to \infty} \frac{1}{n} \log \left( \overline{Z^t_{\mu k}(h^t_{\mu k}, q_0, p_0)} \right),$$

where the sub-index 0 represents the mean value of the parameters when drawn from some suitable distributions and the overline represents the mean value of the partition function over such distributions.

To obtain the scaling behavior of the various parameters we calculate $Z(h, q, p)$ explicitly, assuming the parameters $q$ and $p$ are taken from normal distributions $\mathcal{N}_q(q_0, \sigma_q^2)$ and $\mathcal{N}_p(p_0, \sigma_p^2)$. After a long calculation one obtains the following scaling: $h \sim O(1)$, $q_0 \sim O(1)$, $p_0 \sim O(n^{-1})$, $\sigma_q^2 \sim O(n^{-1})$, and $\sigma_p^2 \sim O(n^{-3})$. In the remainder of the paper we will rescale the off-diagonal elements of $Q^t_{\mu k}$ to $g^t_{\mu k}/n$, where $g^t_{\mu k} \sim O(1)$.

The marginalized posterior at time $t$ takes the form

To find the dominant solutions in the case of large $n$ one studies the maxima of $\Phi(x; h, g)$. One identifies regimes with a single peak and with a double peak, depending on the values of $h$ and $g$ (full details will be given elsewhere); the main contribution comes from a regime where $g^t_{\mu k} > 1$ and $0 < h^t_{\mu k}/g^t_{\mu k} \ll 1$, where $\Phi(x; h, g)$ takes the form of an almost symmetric pair of Gaussians located at

where $\pm x_0$ are the positions of the peaks at zero field.

To calculate the correlation between replica we expand $P(y_\mu | B)$ in the large $N$ limit (Eq. (3)), as in [8], to obtain

where $\Delta_{\mu k} = \frac{1}{\sqrt{N}} \sum_{l \neq k} s_{\mu l} \mathbf{b}_l$.

For large $n$ and small field we obtain the following

where $m^t_{\mu k} = \tanh(x^t_{\mu k}) = x^t_{\mu k}/g^t_{\mu k}$, and

Using these expressions we calculate the first two cumulants of the elements of $\Delta_{\mu k}$:


( A U) = (12) 
where $\Upsilon^t_{\mu k}$ and $R^t_{\mu k}$ can be approximated using the law of large numbers as

The definition of $\Upsilon^t_{\mu k}$ relies on the expected scaling of the off-diagonal terms of the matrix $Q^t_{\mu k}$.

Thus, we expect the variables $\Delta_{\mu k}$ to obey a Gaussian distribution defined by the above cumulants. The mean value of $b_k$ at time $t+1$ is then given by:

mt ^k = (°' 2 +/5 (l-Q^kj+PYk) 

, (13) 


where = (1 / F[)s tik s^i) and I = S k i, respectively. We 
assume that the macroscopic variables are self-averaging and omit the $\mu, k$ indices.

The main difference between Eq. (13) and the equivalent equation in [8] is the emergence of an extra term in the prefactor, $\beta \Upsilon^t$, reflecting correlations between different solution groups (replica). To determine this term we optimize the choice of $\Upsilon^t$ by minimizing the bit error at each time step. Following [8] we define

$$M^t = \frac{1}{NK} \sum_{\mu=1}^{N} \sum_{k=1}^{K} b_k m^t_{\mu k} = \int \mathcal{D}z \, \tanh\left( \sqrt{F^t} z + E^t \right),$$
$$Q^t = \frac{1}{NK} \sum_{\mu=1}^{N} \sum_{k=1}^{K} \left( m^t_{\mu k} \right)^2 = \int \mathcal{D}z \, \tanh^2\left( \sqrt{F^t} z + E^t \right), \qquad (14)$$

where $\mathcal{D}z = dz \exp[-z^2/2] / \sqrt{2\pi}$ and

$$E^{t+1} = \frac{1}{\sigma^2 + \beta (1 - Q^t + \Upsilon^t)},$$
$$F^{t+1} = \left[ \beta (1 - 2M^t + Q^t) + \sigma_0^2 \right] (E^{t+1})^2, \qquad (15)$$

to obtain the bit error rate:
Optimizing $P_b^t$ with respect to $\Upsilon^t$ one straightforwardly obtains that $E^t = F^t$ and $Q^t = M^t$. In principle, the optimization can be done globally [14] but is of limited practical value.

This implies that $\Upsilon^t = (\sigma_0^2 - \sigma^2)/\beta$ is just a constant. However, it holds the key to obtaining accurate inference results. If the noise estimate is identical to the true noise the term vanishes and one retrieves the expression of [8]; otherwise, an estimate of the difference between the two noise values is required for computing $\Upsilon^t$.
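
The constant value of the correction term can be verified in a few lines. The derivation below is a sketch that assumes, following [8], that the estimated noise $\sigma^2$ enters $E^{t+1}$ while the true noise $\sigma_0^2$ enters $F^{t+1}$; the numerical values in the last line are those quoted for Figure 2.

```latex
\begin{aligned}
E^{t+1} &= \frac{1}{\sigma^2 + \beta\,(1 - Q^t + \Upsilon^t)}, \qquad
F^{t+1} = \left[\beta\,(1 - 2M^t + Q^t) + \sigma_0^2\right] (E^{t+1})^2, \\[4pt]
E^{t+1} = F^{t+1}
 &\;\Longrightarrow\; \frac{1}{E^{t+1}} = \beta\,(1 - 2M^t + Q^t) + \sigma_0^2
 \;\overset{Q^t = M^t}{=}\; \beta\,(1 - Q^t) + \sigma_0^2, \\[4pt]
 &\;\Longrightarrow\; \sigma^2 + \beta\,(1 - Q^t + \Upsilon^t) = \sigma_0^2 + \beta\,(1 - Q^t)
 \;\Longrightarrow\; \Upsilon^t = \frac{\sigma_0^2 - \sigma^2}{\beta}, \\[4pt]
\beta = 0.25,\ \sigma_0^2 = 0.25,\ \sigma^2 = 0.01
 &\;\Longrightarrow\; \Upsilon^t = \frac{0.25 - 0.01}{0.25} = 0.96 .
\end{aligned}
```

The $Q^t$-dependent terms cancel, which is why the optimal correction does not change from one iteration to the next.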

From Eq. (13) one obtains:
Figure 2: Error probability of the inferred solution evolving in time. The system load is $\beta = 0.25$, the true noise level $\sigma_0^2 = 0.25$ and the estimated noise $\sigma^2 = 0.01$. Squares represent results of the original algorithm [8], the solid line the dynamics obtained from our equations; circles represent results obtained from the suggested practical algorithm. Variances are smaller than the symbol size. In the inset, $D^t$, a measure of convergence in the obtained solutions, as a function of time; symbols are as in the main figure.

This enables us to rewrite Eq. m as 

(19) 

{is—r 

where no estimate of $\sigma_0$ is required.

The inference algorithm requires an iterative update of the above equations and converges to a reliable estimate of the signal, with no need for accurate prior information on the noise level. The computational complexity of the algorithm is $O(K^2)$.

To test the performance of our algorithm we carried out a set of CDMA signal detection experiments under typical conditions. The error probability of the inferred signals has been calculated for a system load of $\beta = 0.25$, where the true noise level is $\sigma_0^2 = 0.25$ and the estimated noise is $\sigma^2 = 0.01$, as shown in Figure 2. The solid line represents the expected theoretical results (density evolution), knowing the exact values of $\sigma_0^2$ and $\sigma^2$, while circles represent simulation results obtained via the suggested practical algorithm, where no such knowledge is assumed. The results presented are based on $10^5$ trials per point and a system size $N = 2000$, and are superior to those obtained using the original algorithm [8].
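
The density-evolution curve mentioned above can be approximated by iterating the macroscopic quantities directly. The sketch below is not the authors' code: it assumes the closed forms $M = \int \mathcal{D}z \tanh(\sqrt{F} z + E)$ and $Q = \int \mathcal{D}z \tanh^2(\sqrt{F} z + E)$, the recursions $E^{t+1} = 1/[\sigma^2 + \beta(1 - Q^t + \Upsilon)]$ and $F^{t+1} = [\beta(1 - 2M^t + Q^t) + \sigma_0^2](E^{t+1})^2$, a bit error rate given by the Gaussian tail probability of $z < -E/\sqrt{F}$, and the initial condition $M = Q = 0$. The function names and the trapezoidal quadrature are illustrative choices.

```python
import math


def gauss_avg(f, half_width=8.0, steps=4000):
    """Trapezoidal estimate of the average of f(z) for z ~ N(0, 1)."""
    dz = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        z = -half_width + i * dz
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(z) * math.exp(-0.5 * z * z)
    return total * dz / math.sqrt(2.0 * math.pi)


def density_evolution(beta, sigma0_sq, sigma_sq, upsilon, iterations=30):
    """Iterate (M, Q) and record the assumed bit error rate at each step."""
    M = Q = 0.0  # assumed initial condition: all estimates start at zero
    ber = []
    for _ in range(iterations):
        E = 1.0 / (sigma_sq + beta * (1.0 - Q + upsilon))
        F = (beta * (1.0 - 2.0 * M + Q) + sigma0_sq) * E * E
        root_f = math.sqrt(F)
        M = gauss_avg(lambda z: math.tanh(root_f * z + E))
        Q = gauss_avg(lambda z: math.tanh(root_f * z + E) ** 2)
        # Probability that the effective field sqrt(F) z + E is negative.
        ber.append(0.5 * math.erfc(E / math.sqrt(2.0 * F)))
    return M, Q, ber


beta, sigma0_sq, sigma_sq = 0.25, 0.25, 0.01
# Upsilon = 0 corresponds to the uncorrected prefactor.
M0, Q0, ber0 = density_evolution(beta, sigma0_sq, sigma_sq, upsilon=0.0)
# Constant correction (sigma0^2 - sigma^2) / beta.
M1, Q1, ber1 = density_evolution(
    beta, sigma0_sq, sigma_sq, upsilon=(sigma0_sq - sigma_sq) / beta
)
print("uncorrected:", ber0[-1])
print("corrected:  ", ber1[-1])
```

Under these assumptions the corrected recursion keeps $E = F$ and $M = Q$ at every step, and its error rate should not exceed that of the uncorrected recursion at any iteration. The sketch describes only the macroscopic dynamics; it does not reproduce the simulations of the message passing algorithms themselves.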

Another performance measure one should consider is

$$D^t = \frac{1}{K} \left( \mathbf{m}^t - \mathbf{m}^{t-1} \right) \cdot \left( \mathbf{m}^t - \mathbf{m}^{t-1} \right),$$

which provides an indication of the stability of the solutions obtained. In the inset of Figure 2 we see that results obtained from our algorithm show convergence to a reliable solution, in stark contrast to the original algorithm [8]. The physical interpretation of the difference between the two results is assumed to be related to a replica symmetry breaking phenomenon.
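
The convergence measure is simple to evaluate from two successive estimate vectors. The following is a minimal sketch, assuming the measure is normalized by the number of users $K$; the function name `convergence_measure` and the example vectors are hypothetical.

```python
def convergence_measure(m_new, m_old):
    """Mean squared change between two successive estimate vectors.

    Assumption: the squared distance is divided by K, the number of users.
    """
    if len(m_new) != len(m_old):
        raise ValueError("estimate vectors must have the same length")
    K = len(m_new)
    return sum((a - b) ** 2 for a, b in zip(m_new, m_old)) / K


# A stationary estimate gives zero.
print(convergence_measure([0.9, -0.8, 0.7], [0.9, -0.8, 0.7]))  # 0.0
# An estimate that flips sign at every iteration gives a large value.
print(convergence_measure([1.0, -1.0], [-1.0, 1.0]))  # 4.0
```

A value that stays well above zero from one iteration to the next signals that the messages keep changing, which is the behaviour the inset is meant to expose.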

In summary, we present a new algorithm for using belief propagation in densely connected systems that enables one to obtain reliable solutions even when the solution space is fragmented. It represents an extension of existing algorithms of that type, reminiscent of the extension of BP to SP. The algorithm has been tested on the signal detection problem in CDMA and has provided superior results to other existing algorithms [8, 15]. Further research is required to fully determine the potential of the new algorithm.